Clustering subgaussian mixtures by semidefinite programming

Authors

  • Dustin G. Mixon
  • Soledad Villar
  • Rachel Ward
Abstract

We introduce a model-free relax-and-round algorithm for k-means clustering based on a semidefinite relaxation due to Peng and Wei [PW07]. The algorithm interprets the SDP output as a denoised version of the original data and then rounds this output to a hard clustering. We provide a generic method for proving performance guarantees for this algorithm, and we analyze the algorithm in the context of subgaussian mixture models. We also study the fundamental limits of estimating Gaussian centers by k-means clustering in order to compare our approximation guarantee to the theoretically optimal k-means clustering solution.
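For orientation, the k-means semidefinite relaxation attributed to Peng and Wei is commonly written as below. This is a sketch in notation assumed here rather than taken from the abstract: $x_1, \dots, x_N$ are the data points, $D$ is the matrix of squared pairwise distances, $k$ is the number of clusters, and $\mathbf{1}$ is the all-ones vector.

```latex
% Assumed notation (not from the abstract): D_{ij} = \|x_i - x_j\|_2^2,
% X is an N x N symmetric matrix variable, k is the number of clusters.
\[
\begin{aligned}
\text{minimize}\quad   & \operatorname{Tr}(DX) \\
\text{subject to}\quad & \operatorname{Tr}(X) = k, \\
                       & X\mathbf{1} = \mathbf{1}, \\
                       & X \ge 0 \ \text{(entrywise)}, \\
                       & X \succeq 0 .
\end{aligned}
\]
% Feasible point induced by a partition A_1, \dots, A_k of \{1, \dots, N\}:
\[
X = \sum_{t=1}^{k} \frac{1}{|A_t|}\, \mathbf{1}_{A_t} \mathbf{1}_{A_t}^{\top},
\qquad
\operatorname{Tr}(DX) = 2 \sum_{t=1}^{k} \sum_{i \in A_t}
  \Bigl\| x_i - \frac{1}{|A_t|} \sum_{j \in A_t} x_j \Bigr\|_2^2 .
\]
```

Every partition yields a feasible $X$ whose objective value is twice the k-means cost, which is why the program is a relaxation. One plausible reading of the "denoised version" mentioned in the abstract, assuming the data are stored as the columns of a matrix $P$, is the product $PX$: each of its columns is a weighted average of the data points, and for the partition matrix above it is exactly the centroid of the point's cluster.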

Similar resources

Robustness of SDPs for Partial Recovery of Clustering Subgaussian Mixtures

In this paper, we examine the robustness of a relax-and-round k-means clustering procedure, a method for clustering subgaussian mixtures using semidefinite programming first introduced in [MVW16]. We are interested in the robustness of the algorithm when there is an adversarial corruption of N points each through distance at most R0. We show that under such corruption this specific algorithm we...

A Recurrent Neural Network Model for Solving Linear Semidefinite Programming

In this paper we solve a wide range of semidefinite programming (SDP) problems using recurrent neural networks (RNNs). SDP is an important numerical tool for analysis and synthesis in systems and control theory. First we reformulate the problem as a linear programming problem; second, we reformulate it as a first-order system of ordinary differential equations. Then a recurrent neural network...

A tail inequality for quadratic forms of subgaussian random vectors

This article proves an exponential probability tail inequality for positive semidefinite quadratic forms in a subgaussian random vector. The bound is analogous to one that holds when the vector has independent Gaussian entries.

Semidefinite spectral clustering

Multi-way partitioning of an undirected weighted graph, where pairwise similarities are assigned as edge weights, provides an important tool for data clustering, but it is an NP-hard problem. Spectral relaxation is a popular relaxation, leading to spectral clustering, where the clustering is performed by the eigendecomposition of the (normalized) graph Laplacian. On the other hand, semidefin...

A path following interior-point algorithm for semidefinite optimization problem based on new kernel function

In this paper, we obtain new complexity results for solving semidefinite optimization (SDO) problems by interior-point methods (IPMs). We define a new proximity function for SDO based on a new kernel function. Furthermore, we formulate a primal-dual interior-point algorithm for SDO using this proximity function and give its complexity analysis, and then we sho...

Journal:
  • CoRR

Volume: abs/1602.06612

Publication date: 2016